In the claims:

1. A remote content crawler for use in a content search, packaging, and delivery system, comprising:

a remote content crawler processor that controls the remote content crawler;

a network resource processor that acquires data related to resources coupled to one or more communications networks;

a crawling criteria processor that acquires crawling criteria;

a crawler content provider processor that receives, processes and stores content provider listings; and

a network crawler, wherein the network crawler crawls content providers to acquire data related to available content.

2. The remote content crawler of claim 1, further comprising:

a content crawler results processor;

a metadata acquisition processor;

a plurality of crawling servers coupled to the network crawler; and

one or more databases, the one or more databases storing information and data generated in and received by the remote content crawler.

3. The remote content crawler of claim 2, wherein the one or more databases comprise:

a content provider listing database;

a crawling criteria database; and

a network resources database.

4. An apparatus for searching one or more communications networks, accessing content available on the one or more communications networks, and acquiring access to the content, comprising:

one or more processors, wherein the one or more processors receive information related to the content; and

a network crawler coupled to the one or more processors, wherein the network crawler accesses the one or more communications networks to locate available content.

5. The apparatus of claim 4, wherein the network crawler comprises one or more crawling servers, wherein each of the one or more crawling servers searches the one or more communications networks according to specific crawling criteria.

6. The apparatus of claim 5, wherein the network crawler is a World Wide Web robot, wherein the network crawler traverses a hypertext structure of the network and retrieves the content and recursively retrieves additional content referenced in the retrieved content.
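
The recursive retrieval recited in claim 6 can be illustrated with a minimal sketch. It is not the claimed implementation: the `crawl` function, the `LinkExtractor` class, the `max_depth` limit, and the in-memory `PAGES` table (which stands in for real network retrieval) are all assumptions introduced for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href targets of anchor tags in a hypertext document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start, fetch, max_depth=3):
    """Retrieve `start`, then recursively retrieve the content it references."""
    retrieved = {}

    def visit(address, depth):
        # Skip resources already retrieved and stop at the depth limit.
        if address in retrieved or depth > max_depth:
            return
        document = fetch(address)
        if document is None:  # unreachable or missing resource
            return
        retrieved[address] = document
        parser = LinkExtractor()
        parser.feed(document)
        for link in parser.links:
            visit(link, depth + 1)

    visit(start, 0)
    return retrieved


# Hypothetical in-memory "network" standing in for real HTTP retrieval.
PAGES = {
    "/index": '<a href="/music">Music</a> <a href="/video">Video</a>',
    "/music": '<a href="/index">Home</a> <a href="/music/track1">Track</a>',
    "/video": "<p>No further links</p>",
    "/music/track1": "<p>Audio content</p>",
}

result = crawl("/index", PAGES.get)
```

The `retrieved` dictionary doubles as the record of visited resources, so the back-link from `/music` to `/index` does not cause an endless loop.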

7. The apparatus of claim 4, wherein the one or more processors comprise:

a crawler processor coupled to the network crawler, wherein the crawler processor receives crawling schedule information and content search criteria;

a network resource processor coupled to the network crawler, wherein the network resource processor aggregates resource addresses of resources coupled to the one or more communications networks;

a crawling criteria processor that compiles data related to searches to be conducted by the network crawler and generates specific crawling criteria; and

a crawler content provider processor coupled to the network crawler that identifies, tracks, indexes and ranks providers of the content, and generates content provider data, wherein the network crawler receives the content provider data, the specific crawling criteria and the resource addresses and crawls the network based on the received content provider data, the specific crawling criteria, and the resource addresses.

8. The apparatus of claim 7, further comprising a content crawler results processor that receives content data from the network crawler, and that processes the content data and routes sorted and formatted crawling results for storage.

9. An apparatus for finding digital content in one or more communications networks, comprising:

means for building and maintaining network resource data, wherein the network resource data contains address data for content servers coupled to the one or more communications networks;

means, coupled to the means for building and maintaining network resource data, for storing the network resource data;

means for building and maintaining crawling criteria, wherein the crawling criteria are used during a crawling operation to search for the digital content;

means for building and maintaining content provider data, wherein the content provider data comprises data related to potential providers of content on the one or more communications networks; and

means, coupled to the means for building and maintaining network resource data, the means for building and maintaining crawling criteria, and the means for building and maintaining content provider data, for crawling the communications network.

10. The apparatus of claim 9, wherein the means for building and maintaining network resource data includes means for indexing address types.

11. The apparatus of claim 10, wherein the address types include top-level domain and subdomain names, Universal Resource Identifiers, Universal Resource Locators (URLs), and Internet Protocol (IP) address numbers.

12. The apparatus of claim 10, wherein the means for indexing address types is scalable to accommodate future naming conventions.

13. The apparatus of claim 9, wherein the means for building and maintaining the network resource data includes means for updating the address data.

14. The apparatus of claim 13, wherein the means for updating the address data comprises:

means for receiving hyperlinked domain names;

means for downloading domain name records from public and private domain name registration databases;

means for synchronizing a local Domain Name Service (DNS) database with one or more DNS databases over the one or more communications networks;

means for performing reverse domain resolution by locating URLs associated with allowable IP addressing numbers; and

means for verifying Domain Name Service aliases and duplicate URLs against IP addresses to eliminate redundant domain names.
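
The last element of claim 14, verifying aliases against IP addresses to eliminate redundant domain names, can be sketched as follows. This is a simplified illustration rather than the claimed means: the function name `eliminate_redundant_names`, the "first name seen is canonical" rule, and the sample names and documentation-range addresses are assumptions. A real system would also need to account for distinct sites that legitimately share one IP address.

```python
def eliminate_redundant_names(resolved):
    """Keep one domain name per IP address; report the rest as aliases.

    `resolved` maps domain name -> IP address, as a DNS lookup would return.
    The first name seen for each address is treated as canonical (an
    assumption made for this sketch).
    """
    canonical_by_ip = {}
    aliases = {}
    for name, ip in resolved.items():
        if ip in canonical_by_ip:
            # Same address already indexed: record this name as an alias.
            aliases[name] = canonical_by_ip[ip]
        else:
            canonical_by_ip[ip] = name
    return sorted(canonical_by_ip.values()), aliases


# Hypothetical resolution results (192.0.2.0/24 is reserved for documentation).
resolved_names = {
    "example.com": "192.0.2.10",
    "www.example.com": "192.0.2.10",
    "media.example.org": "192.0.2.20",
    "example.net": "192.0.2.10",
}

unique_names, alias_map = eliminate_redundant_names(resolved_names)
```

Only the names in `unique_names` would remain in the network resource data; `alias_map` records which canonical name each dropped alias points to.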

15. The apparatus of claim 9, wherein the network resource data comprises:

URL owner identity;

URL owner contact information;

available content types;

expiration time of the domain name; and

subdomain names to be excluded during crawling.

16. The apparatus of claim 9, wherein the crawling criteria comprise:

terms, phrases and keywords;

data type descriptions;

metadata field names; and

metadata type descriptors, wherein the metadata type descriptors are associated with eligible content as one or more of hypertext descriptions and embedded file and data stream attributes and metadata.

17. The apparatus of claim 9, wherein the means for building and maintaining crawling criteria comprises automatic means for building and maintaining crawling criteria.

18. The apparatus of claim 17, wherein the automatic means comprises:

means for analyzing and importing metadata schemes for standardized and proprietary content formats;

means for parsing metadata field names and descriptive terms; and

means for analyzing hypertext associated with desired hyperlinks and for analyzing text proximate to the desired hyperlinks, wherein the means for analyzing hypertext identifies terms that relate to a data type or content category.

19. The apparatus of claim 9, wherein the means for building and maintaining crawling criteria comprises manual means for building and maintaining crawling criteria.

20. The apparatus of claim 9, further comprising means for storing the crawling criteria.

21. The apparatus of claim 9, wherein the means for building and maintaining content provider data comprises means for ranking content providers.

22. The apparatus of claim 21, wherein criteria for ranking the content providers comprise:

quantity of available content;

provider professional association membership;

amount of content requested and downloaded by users of the communications network; and

content provider ratings, wherein the content provider ratings are provided by the users of the communications network.

23. The apparatus of claim 21, wherein a ranking of a content provider determines how frequently the content provider is crawled.
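
Claims 22 and 23 together describe ranking providers on four criteria and letting the rank set the crawl frequency. The sketch below shows one way that relationship could work. The claims do not specify any formula, so the weights in `score`, the 0.0 to 5.0 rating scale, and the interval parameters of `crawl_intervals` are purely illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    content_items: int        # quantity of available content
    association_member: bool  # professional association membership
    downloads: int            # content requested and downloaded by users
    user_rating: float        # average user rating, assumed 0.0 to 5.0


def score(p):
    """Combine the four ranking criteria; the weights are assumptions."""
    return (
        0.001 * p.content_items
        + (1.0 if p.association_member else 0.0)
        + 0.0001 * p.downloads
        + p.user_rating
    )


def crawl_intervals(providers, shortest_hours=6, step_hours=12):
    """Rank providers by score; higher ranks get shorter crawl intervals."""
    ranked = sorted(providers, key=score, reverse=True)
    return {
        p.name: shortest_hours + position * step_hours
        for position, p in enumerate(ranked)
    }


# Hypothetical providers used only to exercise the sketch.
providers = [
    Provider("A", 5000, True, 20000, 4.5),
    Provider("B", 200, False, 1000, 3.0),
    Provider("C", 1000, True, 500, 4.0),
]

intervals = crawl_intervals(providers)
```

With these sample values the providers rank A, C, B, so A is crawled every 6 hours, C every 18, and B every 30.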

24. The apparatus of claim 9, wherein the means for crawling the communications network comprises one or more crawling servers, wherein the means for building and maintaining the network resource data comprises means for analyzing and subdividing the network resource data and means for providing the subdivided network resource data to the one or more crawling servers.

25. The apparatus of claim 24, wherein a crawling server comprises:

means for reading the subdivided network resource data;

means for communicating with a network resource; and

means for requesting and downloading data from the network resource.

26. The apparatus of claim 25, wherein the crawling server further comprises:

means for comparing the content to the crawling criteria, wherein the crawling server provides data related to the content when the means for comparing indicates the crawling criteria are satisfied; and

means for following links from a first network resource to subsequent network resources, wherein the means for following links comprises:

means for analyzing hypertext structure of the first network resource to determine if the links have been crawled,

means for determining if a network resource has been downloaded or updated since a previous crawl of the network resource, and

means for analyzing the hypertext structure to determine if the link points to a network resource comprising a web page or other hypertext file.

27. The apparatus of claim 26, wherein the crawling server further comprises:

means for caching hypertext files containing the data related to the content;

means for caching the links from the first network resource to subsequent network resources; and

means for indexing web pages and other hypertext files of interest.

28. The apparatus of claim 26, wherein the means for comparing the content to the crawling criteria comprises a comparison algorithm that compares elements in a hypertext file to the crawling criteria.

29. The apparatus of claim 9, further comprising:

means, coupled to the means for crawling the communications network, for acquiring and processing metadata related to a network resource; and

means, coupled to the means for acquiring and processing metadata related to a network resource, for processing content results from the crawled network resources.

30. A method for finding digital content in a communications network, comprising:

acquiring network resource data, wherein the network resource data comprises address data for content servers coupled to the one or more communications networks;

acquiring crawling criteria, wherein crawling criteria are used during a crawling operation to search for the digital content;

acquiring content provider data, wherein content provider data includes digital content provider-related data; and

crawling network resources in the one or more communications networks.

31. The method of claim 30, further comprising storing the network resource data, the crawling criteria, and the content provider data in one or more databases.

32. The method of claim 30, wherein acquiring network resource data comprises indexing 
the address data according to one or more address types. 

33. The method of claim 32, wherein the address types include top-level domain and 
subdomain names, Universal Resource Identifiers, Universal Resource Locators (URLs), and 
Internet Protocol (IP) address numbers. 

34. The method of claim 32, further comprising scaling the address types to 
accommodate future naming conventions. 

35. The method of claim 30, further comprising updating the address data. 

36. The method of claim 35, wherein updating the address data comprises:

receiving hyperlinked domain names for the network resources;

downloading domain name records from public and private domain name registration sources;

synchronizing local Domain Name Service (DNS) databases with one or more DNS databases over the one or more communications networks;

performing reverse domain name resolution, comprising locating URLs associated with allowable IP address numbers;

verifying DNS aliases and duplicate URLs against IP addresses; and

eliminating any duplicate URLs identified by the verifying step.

37. The method of claim 30, wherein the network resource data comprises:

URL owner identity;

URL owner contact information;

available content types;

expiration time of the domain name; and

subdomain names to be excluded during crawling.

38. The method of claim 30, wherein the crawling criteria comprise:

terms, phrases and keywords;

data type descriptions;

metadata field names; and

metadata type descriptors, wherein the metadata type descriptors are associated with eligible content as one or more of hypertext descriptions and embedded file and data stream attributes and metadata.

39. The method of claim 30, wherein acquiring the crawling criteria comprises automatically acquiring the crawling criteria.

40. The method of claim 39, wherein automatically acquiring the crawling criteria comprises:

analyzing and importing metadata schemes for standardized and proprietary content formats;

parsing metadata field names and descriptive terms;

analyzing hypertext associated with desired hyperlinks; and

analyzing text proximate to the desired hyperlinks, wherein analyzing hypertext identifies terms that relate to a data type or content category.

41. The method of claim 30, wherein acquiring the crawling criteria comprises acquiring the crawling criteria through manual input.

42. The method of claim 30, wherein acquiring the content provider data comprises ranking content providers.

43. The method of claim 42, wherein a ranking of a content provider is based on one or more of quantity of available content, provider professional association membership, amount of content requested and downloaded by users of the communications network, and content provider ratings, wherein the content provider ratings are provided by the users of the communications network.

44. The method of claim 43, further comprising determining a frequency of crawling a 
content provider based on the ranking of the content provider. 

45. The method of claim 30, wherein crawling the network resources comprises crawling 
with one or more crawling servers. 

46. The method of claim 45, further comprising:

subdividing the network resources;

assigning the subdivided network resources to the one or more crawling servers; and

at a crawler server:

reading data from the assigned network resources,

communicating with the assigned network resources, and

downloading data from the assigned network resources.

47. The method of claim 46, further comprising: 

comparing digital content from one or more of the assigned network resources to the 
crawling criteria; and 

acquiring data related to content that satisfies the crawling criteria.

48. The method of claim 46, further comprising: 

following links from a first network resource to subsequent network resources, 
wherein following the links comprises: 

analyzing hypertext structure of the first network resource to determine if the 
links have been crawled, 

determining if a network resource has been downloaded or updated since a 
previous crawl of the network resource, and 

analyzing the hypertext structure to determine if the link points to a 
network resource comprising a web page or other hypertext file. 

49. The method of claim 48, further comprising:

caching hypertext files containing the data related to the content;

caching the links from the first network resource to subsequent network resources; and

indexing web pages or other hypertext files of interest.

50. The method of claim 48, wherein comparing the content to the crawling criteria comprises using a comparison algorithm that compares elements in a hypertext file to the crawling criteria.

51. The method of claim 30, further comprising:

acquiring and processing metadata related to a network resource; and

processing content results from the crawled network resources.

52. An apparatus for controlling a remote content crawler having one or more crawling servers, the remote content crawler capable of searching one or more communications networks for data related to content available on the one or more communications networks, the apparatus comprising:

means for communicating with components of the one or more communications networks;

means, coupled to the communications means, for executing crawling of the one or more communications networks by the remote content crawler;

means, coupled to the means for executing crawling, for routing data received by the remote content crawler; and

means, coupled to the data routing means, for aggregating data related to resources of the one or more communications networks, wherein the remote content crawler uses the aggregated data to search the one or more communications networks.

53. The apparatus of claim 52, further comprising:

means, coupled to the communications means, for building a crawling criteria database, wherein the crawling criteria comprises one or more of hypertext search guidelines, data type list, metadata search criteria, and keyword lists.

54. The apparatus of claim 52, further comprising:

means for building a content provider database, wherein data related to content providers is tracked, indexed, and ranked.

55. The apparatus of claim 52, further comprising:

means for retrieving and routing metadata related to the content available on the one or more communications networks; and

means, coupled to the means for retrieving and routing the metadata related to the content available on the one or more communications networks, for indexing and formatting the retrieved metadata.

56. The apparatus of claim 52, wherein the means for executing crawling comprises:

means for storing data related to crawling the one or more communications networks;

means for initiating crawling of the one or more communications networks, the means for initiating crawling comprising means for receiving administrative data related to the crawling of the one or more communications networks; and

means for analyzing a resource data set of the one or more communications networks to subdivide the resource data set into one or more smaller resource data sets, wherein the subdivision is based on one or more of overall size of the resource data set, and a number of available crawling servers.
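
The subdivision in the final element of claim 56 depends on the overall size of the resource data set and the number of available crawling servers. A minimal sketch of one such policy follows. The function name `subdivide`, the `max_per_set` cap, and the round-robin split are assumptions chosen for illustration; the claim does not prescribe a particular scheme.

```python
def subdivide(resources, server_count, max_per_set=1000):
    """Split `resources` into smaller sets sized for the available servers.

    The number of sets is the larger of the server count and the count
    needed to keep every set at or below `max_per_set` entries.
    """
    if server_count < 1:
        raise ValueError("at least one crawling server is required")
    if not resources:
        return []
    needed_for_size = -(-len(resources) // max_per_set)  # ceiling division
    set_count = min(len(resources), max(server_count, needed_for_size))
    # Round-robin slicing keeps the sets within one entry of each other.
    return [resources[i::set_count] for i in range(set_count)]


# Hypothetical resource addresses used only to exercise the sketch.
addresses = [f"host{i}.example" for i in range(10)]

by_servers = subdivide(addresses, server_count=3)
by_size = subdivide(addresses, server_count=2, max_per_set=3)
```

In the first call the server count drives the split (three sets); in the second the size cap dominates, producing four sets even though only two servers are available, so servers would take more than one set each.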

57. The apparatus of claim 56, wherein the means for executing crawling further comprises:

means for determining if contents of hypertext files meet conditions of crawling criteria, comprising:

means for parsing the contents of the hypertext files, and

means for comparing the parsed content to the criteria in a criteria database, wherein if a hypertext file contains sufficient matching data, the hypertext file is cached.
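
The parse, compare, and cache sequence of claim 57 can be sketched as below. This is an illustration under stated assumptions, not the claimed means: the claim does not define "sufficient matching data", so the `threshold` parameter, the single-word keyword matching, and the names `matching_terms` and `cache_if_matching` are hypothetical, and a plain set and dictionary stand in for the criteria database and cache.

```python
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects the text content of a hypertext file."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def matching_terms(hypertext, criteria):
    """Return the single-word criteria terms that occur in the parsed text."""
    parser = TextExtractor()
    parser.feed(hypertext)
    parser.close()
    words = set(re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower()))
    return {term for term in criteria if term.lower() in words}


def cache_if_matching(address, hypertext, criteria, cache, threshold=2):
    """Cache the file when at least `threshold` criteria terms match."""
    matched = matching_terms(hypertext, criteria)
    if len(matched) >= threshold:
        cache[address] = hypertext
        return True
    return False


# Hypothetical criteria and pages used only to exercise the sketch.
criteria = {"jazz", "audio", "video"}
cache = {}
page_a = "<html><body><h1>Live Jazz</h1><p>Streaming audio from the festival.</p></body></html>"
page_b = "<p>Quarterly sales report</p>"

cached_a = cache_if_matching("/festival", page_a, criteria, cache)
cached_b = cache_if_matching("/report", page_b, criteria, cache)
```

Here `page_a` matches two criteria terms and is cached, while `page_b` matches none and is skipped.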